DESCRIPTION



WEB-ENABLED ELECTRONICS APPARATUS, WEB PAGE PROCESSING 
METHOD, AND PROGRAM 

Technical Field 

The present invention relates to a Web-enabled 
electronics apparatus, a Web page processing method, and 
a program, which are applied to electronics apparatuses 
such as PDAs, portable telephones, and television sets that
have a function of connecting to a network, and which 
process content on the Web for display optimized for 
their own display environment. 

Background Art 

Generally speaking, the man-machine interface of
program-embedded, Web-enabled electronics apparatuses,
such as PDAs (Personal Digital (Data) Assistants),
portable telephones, and television sets, is poor
compared with that of personal computers. On the other
hand, much of the content on the Web is designed for
browsing/display by personal computers that employ a
mouse and a high-resolution display device. Thus, when a
user tries to browse/display content on the Web using a
Web-enabled electronics apparatus such as those mentioned
above, the user inevitably encounters various
inconveniences.

For example, most Web-enabled electronics
apparatuses adopt a lower-resolution display device than
that of personal computers. As mentioned earlier, many
Web pages are designed for browsing/display with the
high-resolution display devices used for personal
computers. Thus, as shown in, e.g., Fig. 16, a
low-resolution display device 162 with which a
Web-enabled electronics apparatus 161 such as a PDA is
equipped can display, in many situations, only a part 164
of a whole Web page 163 at a time, imposing a heavy
burden on the user in his or her operation, such as
having to scroll repeatedly, both vertically and
horizontally, to view the whole page.

Methods of increasing the volume of information 
displayable on a small screen include methods of omitting 
images, kerning, wrapping characters depending on the Web 
browser, and a technique for selecting the optimal size 
of a character font for display according to a surface 
area of a display screen (see, e.g., Japanese Patent
Application Publication No. 2002-156957 (paragraph [0065],
Fig. 15)).

Disclosure of the Invention 

However, depending on the Web browser, even with
images omitted, kerning applied, and characters wrapped,
the small screen of a PDA or the like can still display
only a part of a whole page at a time. Further, even
with the technique for selecting an optimal size of a
character font for display according to the surface area
of a display screen, there is a limit to the number of
characters displayable on a single screen, and small
characters have the adverse effect of making reading
difficult.

The present invention has been made to overcome
these problems, and has an object to provide a
Web-enabled electronics apparatus, a Web page processing
method, and a program, which can display a Web page
acquired through a network by reconstructing it into
pages suitable for browsing in a low-resolution display
environment, and which can reconstruct Web pages written
in various types of languages at low cost.

In order to achieve the above object, a Web-enabled
electronics apparatus of the present invention includes
Web page acquiring means for acquiring a first Web page
including at least a headline and a body of story related
to the headline, and Web page reconstructing means for
extracting the body of story from the first Web page
acquired by the Web page acquiring means to create a
second Web page including this body of story, and
extracting the headline from the first Web page to create
a third Web page including this headline and provided
with a link to the second Web page.

That is, this Web-enabled electronics apparatus
enables the first Web page, acquired via a network and
including a headline and a body of story related to this
headline, to be browsed on separate screens by dividing
it into a headline Web page (the third Web page) provided
with a link to the body of story, and a body-of-story Web
page (the second Web page). As a result, it becomes
possible to efficiently browse the whole content of a
high-resolution Web page designed for personal computers,
without scrolling or with only a small amount of
scrolling, in the poor (low-resolution) display
environments of mobile terminals such as PDAs.
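
The division described above can be illustrated with a minimal sketch. It assumes the headline and the body of story have already been extracted as plain strings; the function name split_page and the link target "body.html" are hypothetical and do not appear in this specification.

```python
from html import escape


def split_page(headline: str, body: str, body_href: str = "body.html"):
    """Return (third_page, second_page) as HTML strings.

    second_page holds only the body of story; third_page holds only the
    headline, rendered as a link to the second page.
    body_href is a hypothetical location for the second page.
    """
    second_page = f"<html><body><p>{escape(body)}</p></body></html>"
    third_page = (
        "<html><body>"
        f'<a href="{escape(body_href, quote=True)}">{escape(headline)}</a>'
        "</body></html>"
    )
    return third_page, second_page
```

Keeping the headline page free of body text is what lets it fit on a small screen; the body is fetched only when the link is selected.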

Further, in a case where the headline of the first
Web page is constituted by a headline and subheads, and
its body of story is constituted by a body of story of
the headline and a list of links to articles belonging to
the subheads, a page of the body of story of the headline
and a list page providing links to articles belonging to
the subheads are created as the second Web page, and a
page including the headline provided with a link to its
body of story and the subheads provided with links to the
link list page is created as the third Web page. As a
result, when a headline is designated on the third Web
page, a page of the body of story of that headline can be
displayed, and when a subhead is designated, the list
page providing links to articles belonging to that
subhead can be displayed. Since the Web pages are
provided with a certain regularity as a whole, the user
can eliminate trial and error in his or her operation for
reaching a target Web page, whereby Web browsing that
goes straight to the content itself of the Web page
becomes possible.

Further, in the Web-enabled electronics apparatus
of the present invention, the Web page reconstructing
means includes: display element position judging means
for internally depicting the first Web page and judging
positions of individual display elements on the first Web
page on the basis of this depicted data; cluster
classifying means for connecting together display
elements that are closely related in terms of layout, on
the basis of the judged positions of the display
elements, for classification into several clusters;
specific cluster discriminating means for detecting
layout features of the individual clusters and
discriminating clusters of the headline and of the body
of story on the first Web page from the other clusters on
the basis of a result of this feature detection; and
means for forming, as to the discriminated clusters of
the headline and of the body of story, groups each
including clusters having a same attribute of the
characters that are display elements, calculating an
average of the numbers of characters within the
respective clusters included in each of the groups, and
determining a group having a high average as the body of
story and a group having a low average as the headline.
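
The last of these means can be sketched as follows. This is only an illustration under stated assumptions: the Cluster type, its attr field (standing in for a character attribute such as font size or style), and the function name are hypothetical. As in the embodiment described later, only the two groups containing the most clusters are compared.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Cluster:
    text: str
    attr: tuple  # hypothetical character attribute, e.g. (style, size)


def label_headline_and_body(clusters):
    """Return (headline_group, body_group) as lists of clusters."""
    # Group clusters that share the same character attribute.
    groups = defaultdict(list)
    for c in clusters:
        groups[c.attr].append(c)

    # Keep the two groups that contain the most clusters.
    top_two = sorted(groups.values(), key=len, reverse=True)[:2]
    if len(top_two) < 2:
        raise ValueError("need at least two attribute groups")

    def average_chars(group):
        return sum(len(c.text) for c in group) / len(group)

    # High average character count -> body; low average -> headline.
    body, headline = sorted(top_two, key=average_chars, reverse=True)
    return headline, body
```

If both groups happen to have the same average, this sketch falls back on sort order; a real implementation would need an explicit tie rule.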

Although there are various types of Web page
description languages, such as HTML, XHTML, and XML + CSS,
according to the present invention, Web pages can be
reconstructed as long as they are described in page
description languages that are interpretable and
renderable, and thus costs required for page
reconstruction can be suppressed compared with a Web page
reconstructing method involving a semantics-based
analysis of tags.

Furthermore, in the Web-enabled electronics
apparatus of the present invention, the specific cluster
discriminating means determines a vertical line on a page
which crosses a largest number of the display elements as
a center-of-gravity line, judges layout features of the
individual clusters from at least any of leftward,
rightward, and middle, using this determined
center-of-gravity line as a reference, and discriminates
clusters with a feature thereof judged as being middle
from the other clusters as the clusters of the headline
and of the body of story.

In most Web pages, major content is laid out in the
middle of a horizontal axis of a page. The vertical line
on the page which crosses the largest number of display
elements can be considered as the position on the
horizontal axis of the page at which the major content is
laid out. If this vertical line is set as a
center-of-gravity line and the layout features of the
individual clusters are judged from at least any of
leftward, rightward, and middle using this
center-of-gravity line as a reference, then it is
possible to discriminate, with high accuracy, clusters
whose feature is judged as being middle from the other
clusters as clusters of a headline and of a body of
story.

Further, a Web page processing method according to
another aspect of the present invention is a Web page
processing method for a Web-enabled electronics apparatus
having a processing/computation section and a display
section for displaying Web pages, which method includes a
step of acquiring, through a network, a first Web page
including at least a headline and a body of story related
to this headline, a step of extracting the body of story
from the acquired first Web page by processing/computing
by the processing/computation section to create a second
Web page including this body of story, and a step of
extracting the headline from the first Web page by
processing/computing by the processing/computation
section to create a third Web page including this
headline and provided with a link to the second Web page.

That is, the Web page processing method of this
invention enables the first Web page, acquired through a
network and including a headline and a body of story
related to this headline, to be browsed on separate
screens by dividing it into a headline Web page (the
third Web page) provided with a link to the body of
story, and a body-of-story Web page (the second Web
page). As a result, it becomes possible to efficiently
browse the whole content of a high-resolution Web page
designed for personal computers, without scrolling or
with only a small amount of scrolling, in the poor
(low-resolution) display environments of mobile terminals
such as PDAs.

Further, in the Web page processing method of the
present invention, the processing/computation section is
configured to: internally depict the first Web page;
judge positions of individual display elements on the
first Web page on the basis of this depicted data;
connect together display elements that are closely
related in terms of layout, on the basis of the judged
positions of the display elements, for classification
into several clusters; detect layout features of the
individual clusters and discriminate clusters of the
headline and of the body of story on the first Web page
from the other clusters on the basis of a result of this
feature detection; and, as to the discriminated clusters
of the headline and of the body of story, form groups
each including clusters having a same attribute of the
characters that are display elements, calculate an
average of the numbers of characters within the
respective clusters included in each of the groups, and
determine a group having a high average as the body of
story and a group having a low average as the headline.

Therefore, according to the present invention, Web
pages can be reconstructed as long as they are described
in page description languages that are interpretable and
renderable, and thus costs required for page
reconstruction can be suppressed compared with a Web page
reconstructing method involving a semantics-based
analysis of tags.

Furthermore, in the Web page processing method of
the present invention, the processing/computation section
is configured to determine a vertical line on a page
which crosses a largest number of the display elements as
a center-of-gravity line, judge layout features of the
individual clusters from at least any of leftward,
rightward, and middle, using this determined
center-of-gravity line as a reference, and discriminate
clusters with a feature thereof judged as being middle
from the other clusters as the clusters of the headline
and of the body of story.

The vertical line on the page which crosses the
largest number of display elements can be considered as
the position on the above-mentioned horizontal axis of
the page at which the major content is laid out. If this
vertical line is set as a center-of-gravity line and the
layout features of the individual clusters are judged
from at least any of leftward, rightward, and middle
using this center-of-gravity line as a reference, then it
becomes possible to discriminate, with higher accuracy,
clusters whose feature is judged as being middle from the
other clusters as clusters of a headline and of a body of
story.

Furthermore, a program according to another aspect
of the present invention causes a computer to function as
Web page acquiring means for acquiring a first Web page
including at least a headline and a body of story related
to this headline, and Web page reconstructing means for
extracting the body of story from the first Web page
acquired by the Web page acquiring means to create a
second Web page including this body of story, and
extracting the headline from the first Web page to create
a third Web page including this headline and provided
with a link to the second Web page.

The program of this invention enables the first Web
page, acquired via a network and including a headline and
a body of story related to this headline, to be browsed
on separate screens by dividing it into a headline Web
page (the third Web page) provided with a link to the
body of story, and a body-of-story Web page (the second
Web page). As a result, it becomes possible to
efficiently browse the whole content of a high-resolution
Web page designed for personal computers, without
scrolling or with only a small amount of scrolling, in
the poor (low-resolution) display environments of mobile
terminals such as PDAs.

Further, in the program of this invention, the Web
page reconstructing means causes the computer to function
as: display element position judging means for internally
depicting the first Web page and judging positions of
individual display elements on the first Web page on the
basis of this depicted data; cluster classifying means
for connecting together display elements that are closely
related in terms of layout, on the basis of the judged
positions of the display elements, for classification
into several clusters; specific cluster discriminating
means for detecting layout features of the individual
clusters and discriminating clusters of the headline and
of the body of story on the first Web page from the other
clusters on the basis of a result of this feature
detection; and means for forming, as to the discriminated
clusters of the headline and of the body of story, groups
each including clusters having a same attribute of the
characters that are display elements, calculating an
average of the numbers of characters within the
respective clusters included in each of the groups, and
determining a group having a high average as the body of
story and a group having a low average as the headline.

According to the present invention, Web pages can
be reconstructed as long as they are described in page
description languages that are interpretable and
renderable, and thus costs required for page
reconstruction can be suppressed compared with a Web page
reconstructing method involving a semantics-based
analysis of tags.

Furthermore, in the program of the present
invention, the specific cluster discriminating means is
characterized as causing the computer to function as
means for determining a vertical line on a page which
crosses a largest number of the display elements as a
center-of-gravity line, judging layout features of the
individual clusters from at least any of leftward,
rightward, and middle, using this determined
center-of-gravity line as a reference, and discriminating
clusters with a feature thereof judged as being middle
from the other clusters as the clusters of the headline
and of the body of story.

According to the present invention, it becomes
possible to discriminate clusters with their feature
judged as being middle as clusters of a headline and of a
body of story with high accuracy.

Brief Description of the Drawings

Fig. 1 is a block diagram showing an electrical 
configuration of a Web-enabled electronics apparatus 
according to an embodiment of the present invention.

Fig. 2 is a diagram showing a configuration of 
modules of a page reconstructing program. 

Fig. 3 is a flowchart showing a procedure of the
page reconstructing program.

Fig. 4 is a view showing an input state of a URL to
which an identifier for page reconstruction is appended
in a mobile terminal.

Fig. 5 is a view showing an example of an original
Web page and a result obtained by clustering of each
display element performed on the Web page.

Fig. 6 is a view showing a result obtained by 
classification depending on meaning of a cluster in terms 
of layout. 

Fig. 7 is a flowchart showing a procedure for 
classifying clusters.

Fig. 8 is a flowchart showing a procedure for 
determining a center-of-gravity line in the procedure for 
classifying clusters of Fig. 7. 

Fig. 9 is a view showing a specific example of 
determination of the center-of-gravity line.

Fig. 10 is a flowchart showing a procedure for
determining a meaning given to a cluster in terms of
layout from among "leftward", "rightward", "unused" in
the procedure for classifying clusters of Fig. 7.

Fig. 11 is a view showing a specific example of a
process of determining the meaning given to the cluster
in terms of layout of Fig. 10.

Fig. 12 is a flowchart showing a procedure for 
determining the meaning given to the cluster in terms of 
layout from among "headline (including subheads)", "body 
(including links to articles)" in the procedure for 
classifying clusters of Fig. 7. 

Fig. 13 is a view showing an example of a
reconstructed Web page.

Fig. 14 is a flowchart showing a procedure for
reconstructing a Web page (at the time a top page 133 is
created).

Fig. 15 is a block diagram showing a configuration 
in a case of reconstructing a Web page on a server on a 
network. 

Fig. 16 is a view showing a state in which a usual
Web page is displayed on a low-resolution display device.

Best Modes for Carrying Out the Invention

An embodiment of the present invention will be
described hereunder with reference to the drawings.

Fig. 1 is a block diagram showing an electrical
configuration of a Web-enabled electronics apparatus
according to an embodiment of the present invention.

As shown in the figure, this Web-enabled
electronics apparatus 100 has a CPU (Central Processing
Unit) 1 as a processing/computation section, a main
memory 2, a program/data storage section 3, a network
interface section 5 that processes connection to a
network 4 such as the Internet, a display device 6 that
provides information visually to a user, a graphic
controller 8 that performs a rendering process to a
screen of the display device 6 using a VRAM (video RAM) 7,
a user interface controller 10 that processes input from
a user's operation input section 9 such as a jog dial,
and a bus 11 for transmitting signals among the above
parts.

The CPU 1 performs various computation processing
and control using the main memory 2 as a work area, based
on programs and data stored in, e.g., the program/data
storage section 3, input from the operation input section
9 by the user, and the like. The main memory 2 comprises
a randomly readable and writable, high-speed memory, such
as, e.g., a RAM (Random Access Memory). The program/data
storage section 3 is a read-only or readable/writable,
nonvolatile storage device, and is, e.g., a ROM (Read
Only Memory), a flash ROM, a disk drive, or the like.

The display device 6 is, specifically, a CRT
(Cathode Ray Tube), an LCD (Liquid Crystal Display), a
PDP (Plasma Display Panel), an OEL (Organic
Electroluminescence) display, or the like. The user's
operation input section 9 is, specifically, a simple
keyboard, an IR (Infrared) remote controller, a jog dial,
push buttons, or the like.

The network interface section 5 is, e.g., an analog
modem, a LAN (Local Area Network), ISDN (Integrated
Services Digital Network), ADSL (Asymmetric Digital
Subscriber Line), FTTH (Fiber-To-The-Home), Bluetooth,
FOMA (W-CDMA), or the like.

This Web-enabled electronics apparatus 100 is
provided with a function of reconstructing an acquired
Web page into a form tailored to a display environment
such as the resolution of its own display device, for
displaying and browsing.

The program/data storage section 3 stores a basic
program such as an OS (Operating System) for operating
this Web-enabled electronics apparatus 100, as well as a
page reconstructing program that executes reconstruction
of Web pages under this basic program, a Web browser, and
the like. These programs are loaded into the main memory
2 for interpretation and execution by the CPU 1.

Fig. 2 is a diagram showing a configuration of
modules of the aforementioned page reconstructing program.
As shown in the figure, a page reconstructing program 21
is constituted by an adaptation proxy 31, an adaptation
engine 32, and a clustering engine 33.

Next, the procedure of this page reconstructing
program 21 will be described. Fig. 3 is a flowchart
showing a procedure of this page reconstructing program
21. Note that a mobile terminal 100 such as a PDA is
considered herein as an example of the Web-enabled
electronics apparatus 100.

First, in this mobile terminal 100, the user inputs
a URL. At this time, as shown in Fig. 4, an identifier
(example: "/??ID=index") 52 for page reconstruction is
appended to the end of a URL (example:
http://www.somewhere.com) 51, and then a page browsing
request is inputted, whereby this request is given to the
page reconstructing program 21 as a request for page
reconstruction (ST301).
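
How such a request might be recognized can be sketched as follows. The suffix value is the example identifier given above; the function name parse_request and the choice to treat the identifier as a plain string suffix are assumptions made only for illustration.

```python
# Example identifier from the description; how it is matched is an assumption.
RECONSTRUCT_SUFFIX = "/??ID=index"


def parse_request(requested: str):
    """Return (original_url, wants_reconstruction).

    If the requested string ends with the reconstruction identifier,
    strip it and report that page reconstruction was requested.
    """
    if requested.endswith(RECONSTRUCT_SUFFIX):
        return requested[: -len(RECONSTRUCT_SUFFIX)], True
    return requested, False
```

A request without the identifier passes through unchanged, so ordinary browsing is unaffected.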

In response to the request for page reconstruction,
the page reconstructing program 21 starts the adaptation
proxy 31, and delivers the URL thereto. The adaptation
proxy 31 downloads an original Web page 34 via the
Internet in accordance with the URL, for delivery to the
adaptation engine 32 (ST302).

The adaptation engine 32 stores the source code of
the acquired Web page 34 in the main memory 2 in the form
of a DOM (Document Object Model) tree 35, and internally
renders (does not display) the Web page. Successively,
the adaptation engine 32 finds the draw positions of
display elements such as character strings and images on
the Web page of interest, and stores the position
information in combination with tags as tag/position
information 36 in the main memory 2 (ST303). Note that
the draw positions of the display elements change
depending on the size of a character font, the number of
characters, and the size of an image, and thus the draw
positions are found taking into account the size of the
character font, the number of characters, and the like
for characters, and the size of the image and the like
for images.

The DOM tree means a tree structure in which
elements of a whole page, such as tags, characters, and
images, are made hierarchical so that an application can,
e.g., search or edit the page. Further, the DOM is an
API (Application Programming Interface) for accessing XML
(Extensible Markup Language) documents as a set of node
objects in a tree structure. APIs for XML documents
include SAX (Simple API for XML) besides the DOM.

Thereafter, the adaptation engine 32 delivers the
tag/position information 36 to the clustering engine 33
to instruct the clustering engine 33 to perform
clustering. The clustering engine 33 classifies tags
(display elements) on the Web page into several clusters
by connecting visually closely related (close in
distance) tags (display elements) together based on the
tag/position information 36 (ST304), and stores
information about the classified clusters in the main
memory 2 as a cluster list 37.
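
One simple way to connect elements that are close in distance is sketched below. It is not the grid-based technique and is not taken from this specification: it merges any two bounding boxes whose gap is at most a threshold, using a union-find structure. The Element type, the gap measure, and the max_gap default are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    tag: str
    x: int  # left edge of the rendered box
    y: int  # top edge
    w: int  # width
    h: int  # height


def gap(a: Element, b: Element) -> int:
    """Largest axis-wise gap between two boxes (0 if they touch or overlap)."""
    dx = max(a.x - (b.x + b.w), b.x - (a.x + a.w), 0)
    dy = max(a.y - (b.y + b.h), b.y - (a.y + a.h), 0)
    return max(dx, dy)


def cluster(elements, max_gap=10):
    """Group elements whose boxes lie within max_gap of each other."""
    parent = list(range(len(elements)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(elements)):
        for j in range(i + 1, len(elements)):
            if gap(elements[i], elements[j]) <= max_gap:
                parent[find(i)] = find(j)

    groups = {}
    for i, e in enumerate(elements):
        groups.setdefault(find(i), []).append(e)
    return list(groups.values())
```

The pairwise comparison is quadratic in the number of elements, which is acceptable for a single page but is one reason a grid-based method may be preferred.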

Reference character 70 of Fig. 5 denotes a result
obtained from the clustering of display elements 61a to
61l performed on an original Web page 60. Reference
characters 71a to 71l denote individual clusters: 71b
denotes a cluster of the headline 61b on the Web page;
71c denotes a cluster of the body of story 61c of the
headline; 71f, 71h, 71j denote clusters of the subheads
61f, 61h, 61j, respectively; 71g, 71i, 71k denote
clusters of portions of the lists 61g, 61i, 61k of
articles belonging to the subheads, respectively. Since
they have no visual relation with the other display
elements, the headline 61b and the subheads 61f, 61h, 61j
are generated as the individual clusters 71b, 71f, 71h,
71j, respectively. Further, the article lists 61g, 61i,
61k are generated as the clusters 71g, 71i, 71k,
respectively, with one list being provided for each set
belonging to a single subhead. Other than this, some
display information is obtained as the clusters 71a, 71d,
71l.

Clustering techniques include a grid-based
technique known in the field of 2D data mining (for
reference:
http://www.cs.ualberta.ca/~zaiane/courses/cmput695-00/papers/wave.pdf).

Successively, the clustering engine 33 extracts
layout features from the individual, generated clusters
71a to 71l to give them meanings in terms of layout.
That is, as shown in Fig. 6, the clustering engine 33
classifies the individual clusters 71a to 71l into five
types of meanings, i.e., "leftward" (L), "rightward" (R),
"headline (including subheads)" (H), "body (including
links to articles)" (B), "unused" (U) (ST305), and
delivers the result to the adaptation engine 32. Details
of this classification of clusters will be described
later.

Returning to Fig. 2, the adaptation engine 32
reconstructs the Web page in accordance with the
classification result of the clusters (ST306), and stores
reconstructed page information 38 in the main memory 2.
Thereafter, the Web browser reads the reconstructed page
information 38 stored in the main memory 2 for display on
the screen of the display device 6 (ST307).

Next, the details of the method of classifying 
clusters will be described. 

Fig. 7 is a flowchart showing a procedure for
classifying clusters; Fig. 8 shows a procedure for
determining a center-of-gravity line in the procedure for
classifying clusters of Fig. 7; Fig. 9 shows a specific
example of determination of the center-of-gravity line;
Fig. 10 shows a procedure for determining a meaning given
to a cluster in terms of layout from among "leftward",
"rightward", "unused" in the procedure for classifying
clusters of Fig. 7; Fig. 11 shows a specific example of a
process of determining a meaning given to a cluster in
terms of layout of Fig. 10; Fig. 12 shows a procedure for
determining a meaning given to a cluster in terms of
layout from among "headline (including subheads)", "body
(including links to articles)" in the procedure for
classifying clusters of Fig. 7.

First, in ST701 of Fig. 7, the clustering engine 33
determines a center-of-gravity line of a page screen,
which serves as a reference for detecting the layout
feature of each of the clusters. The center-of-gravity
line means a line along which the largest number of
display elements is arranged on a page screen and which
extends along the Y axis.

As shown by, e.g., the procedure of Fig. 8 and the
specific example of Fig. 9, a specific method of
determining the center-of-gravity line is as follows.
First, grid lines 82 are set which equally divide the
whole of a page screen 81 which has been clustered, into,
e.g., 16 (4 x 4) areas (ST801). As to, e.g., the 4
(2 x 2) areas in the middle of the page, the number of
display elements (P) present is counted for each of lines
extending in the Y-axis direction at predetermined
intervals (Δd) from either the left or right end (in the
X-axis direction) (ST802), and the line extending in the
Y-axis direction for which a maximum count Pmax is
obtained is determined as a center-of-gravity line 83
(ST803-806).
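
The scan described above can be sketched as follows, under stated assumptions: boxes are (x, y, width, height) tuples in page coordinates, the middle 2 x 2 areas are taken to be the central half of the page in each direction, and an element is counted for a line when the line passes through its box inside that region. The function name and the step parameter (standing in for the interval Δd) are hypothetical.

```python
def center_of_gravity_line(boxes, page_w, page_h, step=8):
    """Return the X coordinate of the vertical line crossing the most boxes.

    boxes: iterable of (x, y, w, h) tuples.
    Only the central region (middle 2 x 2 cells of a 4 x 4 grid) is scanned.
    """
    boxes = list(boxes)
    x0, x1 = page_w // 4, 3 * page_w // 4
    y0, y1 = page_h // 4, 3 * page_h // 4

    best_x, best_count = x0, -1
    for x in range(x0, x1 + 1, step):
        # Count boxes that the vertical line at x crosses inside the region.
        count = sum(
            1
            for (bx, by, bw, bh) in boxes
            if bx <= x <= bx + bw and by < y1 and by + bh > y0
        )
        if count > best_count:  # keep the first line reaching the maximum
            best_x, best_count = x, count
    return best_x
```

When several candidate lines share the maximum count, this sketch keeps the leftmost one; the description does not state a tie rule.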

After the center-of-gravity line 83 has been
determined in this way, in ST702 of Fig. 7, a process of
determining the meanings given to the individual clusters
in terms of layout from among "leftward", "rightward",
"unused" is performed. As shown by, e.g., the procedure
of Fig. 10 and the specific example of Fig. 11, a
specific method for this process is as follows. First,
in, e.g., the upper 12 (4 x 3) areas of the 16 (4 x 4)
areas divided by the grid lines 82, of the clusters 71a,
71b, 71c, 71f, 71g, 71h, 71i, 71j, 71k crossing the
center-of-gravity line 83, a line 121 extending in the
Y-axis direction which takes the X coordinate of the left
end of the most leftwardly projecting cluster (having the
minimum X coordinate) (71c in this example) is judged as
a left-hand borderline, and a line 122 extending in the
Y-axis direction which takes the X coordinate of the
right end of the most rightwardly projecting cluster
(having the maximum X coordinate) (71i in this example)
is judged as a right-hand borderline (ST1001). As a
result, the whole of the page screen 81 is divided into
three areas with the left-hand and right-hand borderlines
121, 122 as boundaries.

Thereafter, the clustering engine 33 fetches
information about a single cluster from the cluster list
37 (ST1002). This cluster information includes
information about a display element constituting this
cluster (tag, position information). The clustering
engine 33 determines a meaning given to the cluster in
terms of layout from among "leftward", "rightward",
"unused", based on this cluster information, as follows.

First, if the cluster extends over both the
left-hand borderline 121 and the right-hand borderline 122
(YES at ST1003), the clustering engine 33 classifies that
cluster as an "unused" cluster (ST1007), and excludes it
from the cluster list 37 (ST1008).

In a case where the cluster is completely included
in an area leftward of the left-hand borderline 121 (YES
at ST1004), then the clustering engine 33 classifies that
cluster as a "leftward" cluster (ST1009), and in a case
where the cluster is completely included in an area
rightward of the right-hand borderline 122 (YES at
ST1004), then the clustering engine 33 classifies that
cluster as a "rightward" cluster (ST1009), and excludes
it from the cluster list 37 (ST1010).
Further, in a case where the cluster is not
completely included in either the area leftward of the
left-hand borderline 121 or the area rightward of the
right-hand borderline 122 (NO at ST1004) but crosses
either one of the borderlines, the left-hand borderline
121 or the right-hand borderline 122 (YES at ST1005), the
clustering engine 33 calculates a center-of-gravity line
of the cluster (ST1011), and classifies the cluster
depending on the proximity of the center-of-gravity line
to either the left-hand borderline 121 or the right-hand
borderline 122, i.e., as a "leftward" cluster in a case
where the center-of-gravity line is closer to the
left-hand borderline 121, or as a "rightward" cluster in
a case where it is closer to the right-hand borderline
122 (ST1012), and excludes it from the cluster list 37
(ST1013). The above process is repeated for each one of
the clusters registered in the cluster list 37 (ST1006).
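
The classification of ST1003 to ST1013 can be sketched in
Python as follows. This is only an illustrative sketch
under stated assumptions, not the implementation of the
embodiment: the `Cluster` record, the function names, the
use of the horizontal midpoint as the center-of-gravity
line, and the handling of ties are all hypothetical, and
the borderlines 121, 122 are passed in as plain X
coordinates.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical cluster record holding only its horizontal extent."""
    name: str
    left: float   # X coordinate of the left end
    right: float  # X coordinate of the right end


def classify_by_layout(cluster, left_border, right_border):
    """Return "leftward", "rightward", "unused", or None (kept in the list)."""
    # ST1003: the cluster extends over both borderlines.
    if cluster.left < left_border and cluster.right > right_border:
        return "unused"
    # ST1004: the cluster lies completely outside the central area.
    if cluster.right <= left_border:
        return "leftward"
    if cluster.left >= right_border:
        return "rightward"
    # ST1005: the cluster crosses one of the two borderlines.
    crosses_left = cluster.left < left_border < cluster.right
    crosses_right = cluster.left < right_border < cluster.right
    if crosses_left or crosses_right:
        # ST1011, ST1012: the midpoint stands in for the center-of-gravity
        # line (an assumption); a tie is resolved as "leftward" (also assumed).
        center = (cluster.left + cluster.right) / 2
        if abs(center - left_border) <= abs(center - right_border):
            return "leftward"
        return "rightward"
    # Neither outside nor crossing: left for headline/body classification.
    return None


def split_cluster_list(clusters, left_border, right_border):
    """Label the classified clusters and return those remaining in the list."""
    labels = {}
    remaining = []
    for cluster in clusters:
        label = classify_by_layout(cluster, left_border, right_border)
        if label is None:
            remaining.append(cluster)
        else:
            labels[cluster.name] = label
    return labels, remaining
```

The clusters returned in `remaining` correspond to those
still registered in the cluster list after the loop ends.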

Those clusters not classified as "leftward",
"rightward", or "unused" clusters should be either
"headline (including subheads)" or "body of story
(including links to articles)" clusters. This cluster
classification is performed by a procedure shown in, e.g.,
Fig. 12.

First, the clustering engine 33 fetches information
about a cluster from the cluster list and internally maps
it out for rendering over the main memory 2 (ST1201), and
then scans an area interposed between the left-hand
borderline 121 and the right-hand borderline 122 shown in
Fig. 11 (ST1202). Successively, the clustering engine 33
determines clusters having a common display attribute,
such as the size, color, or style of a font, or a
background color (hereinafter termed "congeneric
clusters"), as a group (ST1203).

Next, the clustering engine 33 selects two groups
each having a large number of clusters, from the
determined groups (ST1204), and calculates, as to each of
the groups, an average of the amounts of information,
such as the numbers of characters, within its congeneric
clusters (ST1205). As a result, a group having a high
information amount average (a large number of characters)
is determined as a "body of story (including links to
articles)", and a group having a low information amount
average (a small number of characters) is determined as a
"headline (including subheads)" (ST1206).

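The grouping and averaging of ST1203 to ST1206 can be
sketched in Python as follows. This is a minimal sketch
under stated assumptions rather than the embodiment's own
code: the `TextCluster` record and the function name are
hypothetical, only font size and color are used as the
common display attributes, "a large number of clusters"
is read as the two largest groups, and the amount of
information is taken to be the character count.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TextCluster:
    """Hypothetical cluster record: its text plus two display attributes."""
    text: str
    font_size: int
    color: str


def classify_headline_and_body(clusters):
    """Split the two largest groups of congeneric clusters by text length."""
    # ST1203: group clusters that share the same display attributes.
    groups = defaultdict(list)
    for cluster in clusters:
        groups[(cluster.font_size, cluster.color)].append(cluster)

    # ST1204: select the two groups holding the most clusters.
    top_two = sorted(groups.values(), key=len, reverse=True)[:2]
    if len(top_two) < 2:
        raise ValueError("need at least two groups of congeneric clusters")

    # ST1205: average number of characters per cluster in a group.
    def average_chars(group):
        return sum(len(c.text) for c in group) / len(group)

    # ST1206: the higher average is the body, the lower is the headline.
    first, second = top_two
    if average_chars(first) >= average_chars(second):
        return {"body": first, "headline": second}
    return {"body": second, "headline": first}
```

A page with only one display style would raise
`ValueError` here; how the embodiment treats that case is
not stated in the description.
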
Next, details of the reconstruction of a Web page 
will be described. 

The adaptation engine 32 reconstructs, as shown in,
e.g., Fig. 13, a top page 133 constituted by a headline
131 and subheads 132, a links-to-articles list page 135
having a group of links 134 to articles belonging to a
subhead 132, an article page 136 constituted by the body
of story of the headline 131 and articles belonging to a
subhead 132, a body-of-story page 137, the
body-of-story/article pages 136, 137, and the like.

In the top page 133, in a case where the headline
131 is selected by a user through operation of the jog
dial or the like, a hyperlink set for that headline 131
switches the display to the body-of-story/article page
137. Further, in the top page 133, in a case where an
arbitrary subhead 132 is selected by the user, a
hyperlink set for the selected subhead 132 displays the
links-to-articles list page 135 belonging to that
subhead 132. Furthermore, when a link 134 to an
arbitrary article is selected on this links-to-articles
list page 135 by the user, the body-of-story/article page
136 to which it is linked is displayed. In a case where
the user wishes to display other body-of-story/article
pages, the user may return to the top page 133 or
the links-to-articles list page 135 by using a return
button of the Web browser or the like, and repeat a
similar operation.

The layout of these pages is set beforehand to suit the
display environment of the mobile terminal, such as the
size and resolution of its display screen.

Fig. 14 is a flowchart showing a procedure for
reconstructing a Web page (at the time the top page 133
is created).

First, the adaptation engine 32 loads
classification data on clusters (ST1401). Successively,
the adaptation engine 32 reads tags in descending order
from the original DOM tree (35 of Fig. 2) (ST1402),
searches for the tag of a headline or a subhead in the
original DOM tree based on the classification data on the
clusters (ST1404), and adds the tag of interest to the
DOM tree of the page for reconstruction (ST1405). If
there is a next tag of interest in the original DOM tree
(YES at ST1406), the adaptation engine 32 reads that
tag by returning to ST1402, and if the next tag of
interest is not the tag of a headline (it is the tag of a
subhead) (NO at ST1403), the adaptation engine 32
searches for the tag of a subhead in the original DOM
tree (ST1404), and adds the tag of interest to the DOM
tree of the page for reconstruction (ST1405). In this
way, the adaptation engine 32 searches for the tags of
the headline and of the subheads for reconstructing the
top page 133, and adds them to the DOM tree of the page
for reconstruction to complete the page for
reconstruction.
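
The loop of ST1402 to ST1406 can be sketched in Python
with the standard `xml.etree.ElementTree` module. This is
a rough sketch under stated assumptions, not the
embodiment's implementation: the function name, the
`html`/`body` skeleton of the new page, and the
representation of the classification data as a mapping
from an element's `id` attribute to a label are all
hypothetical, and "descending order" is read as a
top-down walk of the tree. Nested headline or subhead
tags are not treated specially.

```python
import copy
import xml.etree.ElementTree as ET


def build_top_page(original_root, classification):
    """Copy headline and subhead tags into a new tree for the top page.

    `classification` maps an element's id attribute to a label such as
    "headline" or "subhead" (an assumed stand-in for the cluster data).
    """
    page = ET.Element("html")
    body = ET.SubElement(page, "body")
    # ST1402: read tags from the top of the original tree downward.
    for element in original_root.iter():
        label = classification.get(element.get("id"))
        # ST1403, ST1404: keep only the tags of a headline or a subhead.
        if label in ("headline", "subhead"):
            # ST1405: add a copy, so the original DOM tree stays intact.
            body.append(copy.deepcopy(element))
    return page
```

Passing a different set of labels in the membership test
would give the corresponding sketch for the other
reconstructed pages.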

Similarly, the links-to-articles list page 135 and
the body-of-story/article page 136 can be created by
searching, in ST1404, for links to articles and a body of
story/article in the original DOM tree on the basis of
the classification data on the clusters, and adding, in
ST1405, the tags of interest to the DOM tree of
the pages for reconstruction. By setting the links
necessary for each of the reconstructed pages created in
the above way, moves from one page to another such as
shown in Fig. 13 can be realized.

Thus, according to the present embodiment, Web
pages designed for the display environment of personal
computers can be displayed by conversion into a design
tailored to the display environment of mobile terminals
such as PDAs. Specifically, by reconstructing a Web page
into a size (resolution) displayable at a time on the
display screen of a mobile terminal, it becomes possible
to browse the whole Web page without scrolling. Further,
the top page is constituted by a headline and subheads,
and if the headline is designated on this top page, the
body-of-story page of the headline of interest can be
displayed, and if a subhead is designated, a list page
providing links to articles belonging to the subhead of
interest can be displayed, for example. Thus, each of the
Web pages is provided in a manner having certain
regularity as a whole, whereby efficient Web browsing
becomes possible for a user. Furthermore, trial and error
in operation for reaching a target Web page can be
eliminated, whereby Web browsing straight to the content
itself becomes possible.

Further, according to the present embodiment, Web
pages can be reconstructed as long as they are written in
a page description language that can be interpreted and
rendered. That is, although there are various types of
Web page description languages, such as HTML (HyperText
Markup Language), XHTML (Extensible HyperText Markup
Language), and XML + CSS (Cascading Style Sheets), the
present embodiment can realize reconstruction of Web
pages created in these various description languages
under the same logic. By contrast, a method of
reconstructing a Web page involving a semantics-based
analysis of tags would require an analyzing program
corresponding to each type of page description language,
and also entail tremendous analysis time. Compared with
such a method, the present embodiment can remarkably
reduce costs entailed for page reconstruction.

Further, the present embodiment creates
reconstructed pages using the tags of an original Web
page, whereby an advantage is provided that the
reconstructed Web pages can be browsed directly using the
existing Web browser. Further, Web pages can be
reconstructed without dependence on the type of language
(Japanese, English, or the like) and locale.

Note that the page reconstructing program may not
only be used by incorporation into the Web-enabled
electronics apparatus 100, but may also be provided,
through a storage medium and a communication medium, as a
program incorporatable into a personal computer and a
computer for use as a server.

As shown in, e.g., Fig. 15, it may be configured
such that the adaptation proxy 31, the adaptation engine
32, and the clustering engine 33, which are modules
constituting the page reconstructing program, are
incorporated beforehand into a server 152 existing on a
network 151 such as a LAN (Local Area Network) or the
Internet, and such that the server 152 acquires, in
response to a request from a client 153 which is a
Web-enabled electronics apparatus such as a PDA, a Web
page designated by the client 153 from a Web site 154,
and performs a series of processing for reconstructing
the Web page, for distribution of the reconstructed page
to the client 153 via the network 151.

Further, it may alternatively be configured such
that the components, namely, the adaptation proxy, the
adaptation engine, and the clustering engine, are
distributed to a plurality of servers, to allow the
plurality of servers to perform the series of processing
involved for the reconstruction of a Web page in
cooperation with one another in a distributed manner.

Note that the present invention is not limited to 
any of the above-mentioned embodiments, but may be 
embodied by appropriate modification within the scope of 
the technical idea of the present invention. 


Industrial Applicability

As described above, according to the present
invention, it becomes possible to efficiently browse
the content of the whole of a high-resolution Web
page designed for personal computers without scrolling or
with a small amount of scrolling in low-resolution
display environments, and also to reconstruct a Web page
described in various types of languages at a low cost.



